GARCH(1,1) Models (Volatility Models)#

Goals#

  • Precisely define conditional heteroskedasticity and connect it to volatility clustering.

  • Explain why volatility (conditional variance) is modeled rather than the mean for many return series.

  • Derive and break down the GARCH(1,1) variance recursion in LaTeX.

  • Implement simulation, conditional variance recursion, and variance forecasts in low-level NumPy.

  • Visualize volatility clustering and variance forecasts using Plotly.

Conditional heteroskedasticity (quick recap)#

A return innovation \(\varepsilon_t\) is conditionally heteroskedastic if its conditional variance depends on past information:

\[ r_t = \mu + \varepsilon_t, \qquad \mathbb{E}[\varepsilon_t\mid\mathcal{F}_{t-1}] = 0, \qquad \operatorname{Var}(\varepsilon_t\mid\mathcal{F}_{t-1}) = h_t,\ \text{with } h_t \text{ time-varying}. \]

In finance, \(h_t\) is the conditional variance, and its square root \(\sqrt{h_t}\) is the (conditional) volatility.

Why model variance (volatility) instead of the mean?#

  • Mean returns are at best weakly predictable at short horizons, whereas volatility is typically persistent and therefore forecastable.

  • Many tasks require a forecast of risk (uncertainty), not just expected return.

Common financial use cases:

  • VaR / Expected Shortfall and stress testing

  • Volatility forecasting for portfolio risk budgets and leverage control

  • Options / derivatives context (realized vs implied volatility)

  • Position sizing (e.g., target volatility strategies)
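As a small illustration of the last use case, a volatility-targeting rule scales position size by the ratio of a target volatility to the model's forecast volatility. The numbers below (target, forecasts, leverage cap) are made up for the sketch:

```python
import numpy as np

target_vol = 0.10  # annualized volatility target (assumption)
forecast_vol = np.array([0.08, 0.12, 0.25, 0.15])  # model forecasts (made up)

# Scale exposure inversely to forecast volatility, capped at 2x leverage
leverage = np.minimum(target_vol / forecast_vol, 2.0)
print(leverage)  # higher forecast volatility -> smaller position
```
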

GARCH(1,1) model (LaTeX)#

GARCH extends ARCH by adding lagged conditional variance (persistence).

\[ \varepsilon_t = z_t\sqrt{h_t}, \qquad z_t\ \text{i.i.d. with}\ \mathbb{E}[z_t]=0,\ \operatorname{Var}(z_t)=1 \]
\[ h_t = \omega + \alpha\varepsilon_{t-1}^2 + \beta h_{t-1} \]

Variance equation breakdown#

\[ h_t = \underbrace{\omega}_{\text{baseline}} + \underbrace{\alpha\varepsilon_{t-1}^2}_{\text{shock/news impact}} + \underbrace{\beta h_{t-1}}_{\text{volatility persistence}} \]
  • \(\alpha\): how strongly “new information” (yesterday’s squared shock) moves volatility.

  • \(\beta\): how persistent volatility is (how slowly it decays).
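To make the breakdown concrete, here is a single variance update with illustrative (made-up) numbers:

```python
# One step of h_t = omega + alpha * eps_{t-1}^2 + beta * h_{t-1}
omega, alpha, beta = 0.02, 0.08, 0.90  # illustrative parameters
eps_prev = 1.5  # yesterday's shock
h_prev = 1.0    # yesterday's conditional variance

baseline = omega                  # 0.02
news = alpha * eps_prev**2        # 0.08 * 2.25 = 0.18
persistence = beta * h_prev       # 0.90 * 1.00 = 0.90
h_t = baseline + news + persistence
print(h_t)  # 0.02 + 0.18 + 0.90 = 1.10
```
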

Common parameter constraints#

  • \(\omega > 0\), \(\alpha\ge 0\), \(\beta\ge 0\)

  • Second-order stationarity (finite unconditional variance): \(\alpha + \beta < 1\)

Under \(\alpha + \beta < 1\), the long-run (unconditional) variance is:

\[ \bar h = \mathbb{E}[h_t] = \frac{\omega}{1-\alpha-\beta}. \]
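A quick numerical sanity check, using illustrative parameters with \(\alpha+\beta = 0.98\): the sample variance of a long simulated path should be close to \(\bar h\) (the loop below is a throwaway simulator just for this check):

```python
import numpy as np

omega, alpha, beta = 0.02, 0.08, 0.90  # alpha + beta = 0.98 < 1
h_bar = omega / (1.0 - alpha - beta)
print(h_bar)  # 0.02 / 0.02 = 1.0

# Simulate a long path and compare its sample variance to h_bar
rng = np.random.default_rng(0)
n = 200_000
h, eps = h_bar, 0.0
eps_path = np.empty(n)
for t in range(n):
    h = omega + alpha * eps**2 + beta * h
    eps = np.sqrt(h) * rng.standard_normal()
    eps_path[t] = eps
print(eps_path.var())  # close to h_bar (sampling noise is large under high persistence)
```
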

Volatility clustering + market-shock intuition#

  • A big market shock at time \(t\) means \(|\varepsilon_t|\) is large.

  • That makes \(\varepsilon_t^2\) large, which pushes up \(h_{t+1}\).

  • If \(\beta\) is large, elevated variance persists for many periods, producing clusters of large moves.
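The classic diagnostic for this intuition: returns themselves are (nearly) serially uncorrelated, but squared returns are positively autocorrelated. A minimal sketch with an inline simulator and illustrative parameters:

```python
import numpy as np

def acf1(x):
    """Lag-1 sample autocorrelation."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

# Quick GARCH(1,1) simulation (illustrative parameters)
omega, alpha, beta = 0.02, 0.08, 0.90
rng = np.random.default_rng(42)
n = 50_000
h_bar = omega / (1.0 - alpha - beta)
h, eps = h_bar, 0.0
r = np.empty(n)
for t in range(n):
    h = omega + alpha * eps**2 + beta * h
    eps = np.sqrt(h) * rng.standard_normal()
    r[t] = eps

print(acf1(r))     # near 0: returns are uncorrelated
print(acf1(r**2))  # clearly positive: volatility clustering
```
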

Variance forecasts (multi-step)#

Given \(\mathcal{F}_t\), the one-step-ahead forecast follows directly from the variance equation:

\[ h_{t+1\mid t} = \omega + \alpha\varepsilon_t^2 + \beta h_t. \]

For \(k\ge 2\), we use \(\mathbb{E}[\varepsilon_{t+k-1}^2\mid\mathcal{F}_t] = h_{t+k-1\mid t}\), giving the recursion:

\[ h_{t+k\mid t} = \omega + (\alpha+\beta)\,h_{t+k-1\mid t}. \]

Closed form (mean reversion to \(\bar h\)):

\[ h_{t+k\mid t} = \bar h + (\alpha+\beta)^{k-1}\,(h_{t+1\mid t} - \bar h). \]
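The recursion and the closed form can be checked against each other numerically; the current state \((\varepsilon_t, h_t)\) below is made up for the check:

```python
import numpy as np

omega, alpha, beta = 0.02, 0.08, 0.90
h_bar = omega / (1.0 - alpha - beta)

eps_t, h_t = 2.0, 1.5  # arbitrary (made-up) current state
h1 = omega + alpha * eps_t**2 + beta * h_t  # h_{t+1|t}

K = 40
# Recursion: h_{t+k|t} = omega + (alpha + beta) * h_{t+k-1|t}
h_rec = np.empty(K)
h_rec[0] = h1
for k in range(1, K):
    h_rec[k] = omega + (alpha + beta) * h_rec[k - 1]

# Closed form: h_{t+k|t} = h_bar + (alpha + beta)^(k-1) * (h1 - h_bar)
ks = np.arange(1, K + 1)
h_closed = h_bar + (alpha + beta) ** (ks - 1) * (h1 - h_bar)

print(np.allclose(h_rec, h_closed))  # True
```

Either form also shows geometric mean reversion: the gap to \(\bar h\) shrinks by a factor \(\alpha+\beta\) each step.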

import numpy as np
import pandas as pd

import plotly.graph_objects as go
from plotly.subplots import make_subplots
import os
import plotly.io as pio

pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
pio.templates.default = "plotly_white"

np.set_printoptions(precision=4, suppress=True)
def simulate_garch11(
    T: int,
    omega: float,
    alpha: float,
    beta: float,
    mu: float = 0.0,
    burn: int = 500,
    seed: int = 7,
):
    """Simulate a GARCH(1,1) process with Gaussian innovations."""
    if omega <= 0:
        raise ValueError("omega must be > 0")
    if alpha < 0 or beta < 0:
        raise ValueError("alpha and beta must be >= 0")
    if alpha + beta >= 1:
        raise ValueError("Need alpha + beta < 1 for finite unconditional variance in this demo")

    h_bar = omega / (1.0 - alpha - beta)
    n = int(T + burn)

    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)

    eps = np.zeros(n)
    h = np.full(n, h_bar)

    eps[0] = np.sqrt(h_bar) * z[0]
    for t in range(1, n):
        h[t] = omega + alpha * (eps[t - 1] ** 2) + beta * h[t - 1]
        eps[t] = np.sqrt(h[t]) * z[t]

    eps = eps[burn:]
    h = h[burn:]
    r = mu + eps
    return r, eps, h


def garch11_conditional_variance(eps, omega: float, alpha: float, beta: float, initial_variance: float | None = None):
    """Filter the conditional variance h_t from observed innovations eps.

    If initial_variance is None, h_0 is initialized at the unconditional
    variance omega / (1 - alpha - beta).
    """
    eps = np.asarray(eps, dtype=float)
    if initial_variance is None:
        if alpha + beta >= 1:
            raise ValueError("Need alpha + beta < 1 to use the unconditional variance as initialization")
        initial_variance = omega / (1.0 - alpha - beta)

    h = np.full(eps.size, float(initial_variance))
    for t in range(1, eps.size):
        h[t] = omega + alpha * (eps[t - 1] ** 2) + beta * h[t - 1]
    return h


def garch11_forecast_variance(eps_last: float, h_last: float, omega: float, alpha: float, beta: float, horizon: int):
    """Forecast h_{t+k|t} for k=1..horizon."""
    horizon = int(horizon)
    if horizon < 1:
        return np.array([], dtype=float)

    h_fore = np.zeros(horizon, dtype=float)
    h_fore[0] = omega + alpha * (eps_last**2) + beta * h_last
    for k in range(1, horizon):
        h_fore[k] = omega + (alpha + beta) * h_fore[k - 1]
    return h_fore
# Example: simulate a persistent GARCH(1,1) to show volatility clustering
T = 2500
mu = 0.0
omega = 0.02
alpha = 0.08
beta = 0.90  # alpha+beta close to 1 => strong persistence

r, eps, h = simulate_garch11(T=T, omega=omega, alpha=alpha, beta=beta, mu=mu, seed=123)

view = 800
df = pd.DataFrame(
    {
        "t": np.arange(T),
        "return": r,
        "eps_sq": eps**2,
        "h": h,
        "sigma": np.sqrt(h),
    }
).tail(view)

fig = make_subplots(
    rows=2,
    cols=1,
    shared_xaxes=True,
    vertical_spacing=0.06,
    subplot_titles=("Simulated returns", "Volatility clustering: squared returns vs conditional variance"),
)

fig.add_trace(go.Scatter(x=df["t"], y=df["return"], mode="lines", name="r_t"), row=1, col=1)
fig.add_trace(
    go.Scatter(x=df["t"], y=df["eps_sq"], mode="lines", name=r"$\varepsilon_t^2$", line=dict(width=1)),
    row=2,
    col=1,
)
fig.add_trace(
    go.Scatter(x=df["t"], y=df["h"], mode="lines", name=r"$h_t$", line=dict(width=2)),
    row=2,
    col=1,
)

fig.update_yaxes(title_text="return", row=1, col=1)
fig.update_yaxes(title_text="variance", row=2, col=1)
fig.update_xaxes(title_text="time", row=2, col=1)
fig.update_layout(height=650, legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="left", x=0))
fig.show()
# Variance forecasts: baseline vs a one-time "shock" at the last observation
horizon = 80

eps_last = float(eps[-1])
h_last = float(h[-1])

h_fore = garch11_forecast_variance(eps_last=eps_last, h_last=h_last, omega=omega, alpha=alpha, beta=beta, horizon=horizon)

shock_multiplier = 6.0
eps_last_shocked = shock_multiplier * np.sqrt(h_last)
h_fore_shocked = garch11_forecast_variance(
    eps_last=eps_last_shocked, h_last=h_last, omega=omega, alpha=alpha, beta=beta, horizon=horizon
)

h_bar = omega / (1.0 - alpha - beta)

lookback = 250
hist_x = np.arange(T - lookback, T)
fore_x = np.arange(T, T + horizon)

fig = go.Figure()
fig.add_trace(go.Scatter(x=hist_x, y=h[-lookback:], mode="lines", name=r"historical $h_t$"))
fig.add_trace(go.Scatter(x=fore_x, y=h_fore, mode="lines", name=r"forecast (baseline)"))
fig.add_trace(go.Scatter(x=fore_x, y=h_fore_shocked, mode="lines", name=r"forecast (last shock)", line=dict(dash="dot")))
fig.add_trace(
    go.Scatter(
        x=np.concatenate([hist_x, fore_x]),
        y=np.full(hist_x.size + fore_x.size, h_bar),
        mode="lines",
        name=r"long-run $\bar h$",
        line=dict(dash="dash"),
    )
)

fig.update_layout(
    title="GARCH variance forecasts: mean reversion and the impact of a market shock",
    xaxis_title="time",
    yaxis_title="variance",
    height=500,
)
fig.show()

Notes / diagnostics#

  • Volatility persistence is largely controlled by \(\alpha + \beta\); values close to 1 create slow decay and strong clustering.

  • If \(\alpha + \beta \ge 1\), the long-run variance \(\bar h\) is not finite (often discussed as IGARCH / near-unit-root volatility).

  • Real data often has heavy tails; using a Student-t distribution for \(z_t\) is common in practice.
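As a sketch of the last point, the Gaussian simulator above can be adapted to standardized Student-t innovations. The degrees of freedom `nu = 6` and the other parameters are illustrative; `nu > 2` is required so the rescaling to unit variance is well defined:

```python
import numpy as np

def simulate_garch11_t(T, omega, alpha, beta, nu=6.0, seed=0):
    """GARCH(1,1) with Student-t innovations rescaled to unit variance (nu > 2)."""
    rng = np.random.default_rng(seed)
    # Raw t_nu has variance nu / (nu - 2); divide by its std so Var(z_t) = 1
    z = rng.standard_t(df=nu, size=T) / np.sqrt(nu / (nu - 2.0))
    h_bar = omega / (1.0 - alpha - beta)
    eps = np.zeros(T)
    h = np.full(T, h_bar)
    eps[0] = np.sqrt(h_bar) * z[0]
    for t in range(1, T):
        h[t] = omega + alpha * eps[t - 1] ** 2 + beta * h[t - 1]
        eps[t] = np.sqrt(h[t]) * z[t]
    return eps, h

eps, h = simulate_garch11_t(T=100_000, omega=0.02, alpha=0.08, beta=0.90, nu=6.0, seed=1)

# Sample kurtosis of eps: Gaussian GARCH already exceeds 3; t innovations push it higher
x = eps - eps.mean()
kurt = (x**4).mean() / (x**2).mean() ** 2
print(round(kurt, 2))  # well above the Gaussian value of 3
```
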